The Global Greats: Nobel Laureates

- By Sakshat Rao

The Nobel Prize symbolizes the utmost contribution towards the betterment of society. For more than a century, people have performed incredible work through scientific discoveries, brilliant literature and inspiring peace-making. The Nobel Prize is an award that represents the gratitude the world has towards such individuals and organizations.

For this notebook, I wanted to dive deeper into the people who have won Nobel Prizes, called Nobel Laureates. I wanted to explore interesting insights about such individuals and visualize them through beautiful graphs, plots and maps. So let's dive in!

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──

 ggplot2 3.3.3      purrr   0.3.4
 tibble  3.1.1      dplyr   1.0.5
 tidyr   1.1.3      stringr 1.4.0
 readr   1.4.0      forcats 0.5.0

── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
 dplyr::filter() masks stats::filter()
 dplyr::lag()    masks stats::lag()

Loading required package: magrittr


Attaching package: ‘magrittr’


The following object is masked from ‘package:purrr’:

    set_names


The following object is masked from ‘package:tidyr’:

    extract


Loading required package: sp

rgdal: version: 1.4-8, (SVN revision 845)
 Geospatial Data Abstraction Library extensions to R successfully loaded
 Loaded GDAL runtime: GDAL 2.4.0, released 2018/12/14
 Path to GDAL shared files: /usr/share/gdal
 GDAL binary built with GEOS: TRUE 
 Loaded PROJ.4 runtime: Rel. 5.2.0, September 15th, 2018, [PJ_VERSION: 520]
 Path to PROJ.4 shared files: (autodetected)
 Linking to sp version: 1.4-1 


Attaching package: ‘reshape’


The following object is masked from ‘package:dplyr’:

    rename


The following objects are masked from ‘package:tidyr’:

    expand, smiths



Attaching package: ‘reshape2’


The following objects are masked from ‘package:reshape’:

    colsplit, melt, recast


The following object is masked from ‘package:tidyr’:

    smiths



Attaching package: ‘zoo’


The following objects are masked from ‘package:base’:

    as.Date, as.Date.numeric


A data.frame: 6 × 6

   Name               Year   Countries                                                     Category                Gender  BirthYr
   <chr>              <dbl>  <chr>                                                         <fct>                   <fct>   <int>
1  Charles K. Kao     2009   China, Republic of, Hong Kong, United Kingdom, United States  Physics                 Man     1933
2  Michael Levitt     2013   Israel, South Africa, United Kingdom, United States           Chemistry               Man     1947
3  Leonid Hurwicz     2007   Poland, Russia and Soviet Union, United States                Economics               Man     1917
4  Vladimir Prelog    1975   Bosnia and Herzegovina, Croatia, Switzerland                  Chemistry               Man     1906
5  Sydney Brenner     2002   South Africa, United Kingdom, United States                   Physiology or Medicine  Man     1927
6  Samuel C. C. Ting  1976   Taiwan (China, Republic of), United States                    Physics                 Man     1936

Over the course of my analysis, I wanted to answer these questions. For some of them, I had a basic intuition about what the answer would be, but I wanted to see it for myself through the data. And for some questions, I had no clue about what to expect!

Global Distribution of Nobel Laureates

  1. How are Nobel Laureates globally distributed?
  2. Which continents boast more Nobel Laureates than others?
  3. What is the value of the 'Nobel Laureate density' metric for different countries?
  4. What is the global distribution of Nobel Laureates for different awards?

Women Nobel Laureates

  1. How represented are women among Nobel Laureates?
  2. Are there certain awards which have significantly better representation of women than others?
  3. Has the representation of women increased over time?

Age of Nobel Laureates

  1. What is the average age of Nobel Laureates when they receive their Nobel Prize?
  2. What is the general trend of age of Nobel Laureates with time?
  3. Are there awards/genders which boast significantly younger Nobel Laureates than others?

Popularity of Nobel Laureates

  1. Who are the most popular/celebrated/searched-for Nobel Laureates?
  2. Do Nobel Laureates experience a significant boost in popularity during the time they are awarded the Nobel prize?

Global Distribution

How are Nobel Laureates globally distributed?

This is a basic question I wanted to answer. How are Nobel Laureates distributed around the world? Is the distribution uniform among countries? Is there some kind of geographical bias? Do developed countries dominate the scene? Let's find out.

Now, we cannot expect every Nobel Laureate to belong to a single country. Many laureates were immigrants, while some hold multiple citizenships. Thus, to treat countries equally, each laureate's single unit of credit is split evenly among all the countries they are associated with.

For example, if Nobel Laureate 'A' is from India, then India gets a contribution of 1. If another Nobel Laureate 'B' is associated with India, USA and Canada, then each of these three countries gets a contribution of 1/3 (about 0.33). Finally, we sum up these contributions for each country and map the totals.
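Here is a minimal sketch of this contribution scheme in base R. It uses a hypothetical two-laureate data frame (the 'A' and 'B' from the example) rather than the notebook's data, and it assumes that the Countries column can be split on commas, which would not hold for labels that themselves contain a comma.

```r
# Hypothetical sample in the same shape as the Countries column shown earlier.
laureates <- data.frame(
  Name = c("A", "B"),
  Countries = c("India", "India, United States, Canada"),
  stringsAsFactors = FALSE
)

# Split each comma-separated string into individual country names.
# ASSUMPTION: a plain comma split; labels such as "China, Republic of"
# contain commas themselves and would need special handling.
country_lists <- strsplit(laureates$Countries, ",\\s*")

# Each laureate contributes 1 / (number of associated countries) to each country.
shares <- data.frame(
  Country = unlist(country_lists),
  Share = rep(1 / lengths(country_lists), times = lengths(country_lists))
)

# Sum the fractional shares per country.
contributions <- tapply(shares$Share, shares$Country, sum)
print(round(contributions, 2))
```

Because every laureate hands out exactly one unit in total, the contributions always sum to the number of laureates, which is a handy sanity check.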

As shown, the USA is incredibly dominant in the global distribution of Nobel Laureates. No other country (apart from the UK and Germany) really comes close to it. This is probably not only because of the technology, development, education and standard of living in the States, but also because of the so-called 'brain drain', where many people emigrated from other countries to the States and prospered there.

Which continents boast more Nobel Laureates than others?

Now let's take the same concept and apply it to continents. Given how dominant the USA was at the country level in the previous graph, I expected North America to dominate here as well.

Rather surprisingly, North America & Europe have comparable contributions (in fact, Europe just pips North America!). It seems that the many smaller contributions from individual European countries add up to edge out the single, extremely large contribution of the USA.

What is the value of the 'Nobel Laureate density' metric for different countries?

After analyzing the contributions of these countries, one of my doubts was "If a country has a larger population, wouldn't there be a statistically higher chance of having more Nobel Laureates?". For example, Greece has a smaller population than the USA, and maybe that is why it does not have as many Nobel Laureates. It seems like a silly excuse, but I wanted to see how efficient countries were at producing Nobel Laureates from their populations.

For this task, I define a simple metric called Nobel Laureate density: the number of Nobel Laureates per million citizens. But since the population of a country changes over time, I compute it over a fixed period of time, which in my case is 20 years.

For example, if India had 3 Nobel Laureates in the twenty years between 1990 and 2010, with an average population of 50,000,000 in that period, then the Nobel Laureate density for 1990-2010 is 3 / 50 = 0.06 laureates per million citizens.
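The metric is simple enough to write as a one-line function. This is just a sketch of the definition above; the function name laureate_density is my own, and the numbers are the illustrative ones from the example, not real data.

```r
# Nobel Laureate density: laureates per million citizens over a given period.
# `laureate_density` is a hypothetical helper name used for illustration.
laureate_density <- function(n_laureates, avg_population) {
  n_laureates / (avg_population / 1e6)
}

# Illustrative numbers from the text: 3 laureates, average population of 50 million.
density <- laureate_density(3, 50e6)
print(density)
```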

Quite interesting results! Here I have plotted the Nobel Laureate density of the top 10 countries. As it turns out, Sweden and Switzerland have incredibly high Nobel Laureate densities, which means these countries perform very well relative to their small populations. They are followed by the UK and USA, which are more populous but also have more Nobel Laureates.

Fun Fact: Sweden between the 1970s and 1990s had 1 Nobel Laureate per million citizens on average - imagine if countries like India or China (with their population in billions) could achieve that kind of efficiency!

What is the global distribution of Nobel Laureates for different awards?

We looked at how all Nobel Laureates were distributed globally. Now let's take a look at how Nobel Laureates for each award are distributed.

The global distributions for Nobel Laureates in Physics, Chemistry, Medicine and Economics are more or less similar to the overall global distribution. The change comes with Peace and Literature, which are far more widely spread around the world. Although the USA still dominates these awards as well, we can see contributions from many more Asian, African and South American countries.

One possible reason for this could be as follows - Physics, Chemistry, Medicine and Economics are probably more technical fields, which is why developed countries like the USA, UK and Germany dominate the scene. However, Literature and Peace are not restricted by how technologically advanced a country is, which is probably why we find broader representation, sometimes even from less developed countries.

Women Nobel Laureates

How represented are women among Nobel Laureates?

This is a question that is gaining more and more traction, especially today when people are starting to realize there needs to be equity for women in different fields. Let us see whether women are well represented among Nobel Laureates.

[1] "Average Women Percentage Representation: 6.16%"
Using Name as value column: use value.var to override.

[1] "Percentage of years when no woman was awarded the Nobel Prize: 65.8%"
[1] "Maximum Woman Representation attained was 38.5% in 2009"

As can be seen, the representation of women among Nobel Laureates is pathetic, a mere 6%. In roughly two-thirds of the years in the history of the Nobel Prizes (65.8%), not a single woman was awarded a prize. From the graph, one can notice that, on average, men can expect maybe 7-8 Nobel Prizes per year whereas women would probably receive 0-1.

There needs to be improvement in this regard. I do not mean to say that women should be given more Nobel Prizes just for the sake of it, but it seems fairly clear that the nomination and selection of Nobel Prizes has not been equally fair to both genders.

Are there certain awards which have better representation of women than others?

To investigate this apparent gender bias among Nobel Laureates further, let us look at the representation of women in each Nobel Prize category.

It is evident from the graph that the Nobel Prizes for Peace & Literature have much higher representation of women than the others. This observation resembles one of our previous insights - that Peace & Literature also have much broader representation among countries.

This made me wonder whether there is some connection between how countries and women are represented, since only Peace and Literature have higher country and women representation.

Why do Nobel Prizes in Peace & Literature have both better global distribution and higher women representation?

There could be two possibilities in my opinion -

  1. This is a coincidence. It is by chance that the categories with broader country representation also have higher women representation.
  2. There exists some relationship that we have to analyze.

One possible relationship that I could think of was as follows - We know that the global distribution of Nobel Laureates is dominated by a few 'dominant countries' like the USA, UK and Germany. Only for Peace and Literature do 'other countries' like Russia, China, India and South Africa also start having better representation. So could it be possible that the higher woman representation for Peace and Literature is because of these 'other countries'? Maybe these 'other countries' provide a higher women representation while the 'dominant countries' provide a very low women representation? Let us find out.

P-value for difference in women representation between 'dominant' and 'other' countries:
	Physics = 0.484
	Chemistry = 0.916
	Economics = 0.294
	Physiology or Medicine = 0.139
	Peace = 0.00356 *
	Literature = 0.834

Plotted above is the women representation for different countries and categories. We are interested in the Peace & Literature categories, and especially in comparing the 'dominant countries' (the top 10 countries at the top of the plot) with the 'other countries' (the remaining countries at the bottom of the plot).

We can clearly see more yellow spots towards the bottom of the plot, indicating that the 'other countries' have higher women representation than the 'dominant countries'. This difference is much more evident for the Peace category. The p-values confirm this: the difference is statistically significant (p < 0.05) only for Peace (p = 0.00356), and not for Literature (p = 0.834) or any other category.

This suggests that, at least in the Peace category, dominant countries like the USA, UK and Germany could be part of the reason why women representation is low. Put another way, the dominant countries pull women representation down while the 'other countries' push it up. For Literature, however, the data does not support this explanation.
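To make the comparison concrete, here is one way such a p-value could be computed for a single category. This is only a sketch: the text above does not say which test produced the p-values, so Fisher's exact test is my assumption, and the counts are hypothetical numbers made up for illustration, not the notebook's data.

```r
# Hypothetical counts for one category (NOT the notebook's data).
counts <- matrix(
  c(5, 60,    # dominant countries: women, men
    15, 35),  # other countries:    women, men
  nrow = 2, byrow = TRUE,
  dimnames = list(Group = c("dominant", "other"), Gender = c("Woman", "Man"))
)

# Share of women in each group of countries.
women_share <- counts[, "Woman"] / rowSums(counts)

# ASSUMPTION: Fisher's exact test stands in for whatever test was actually used.
# alternative = "less" asks whether the odds of a laureate being a woman are
# lower in the first row ('dominant') than in the second row ('other').
result <- fisher.test(counts, alternative = "less")

print(round(women_share, 3))
print(result$p.value)
```

Repeating this once per category gives one p-value per award, which is the shape of the output listed above.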

Has the representation of women increased over time?

The bias against women in past awards cannot be undone. But have we done anything to ensure a better future? Let us look at the trend of women representation with time.

`geom_smooth()` using formula 'y ~ x'

These are smoothed graphs of the trend. Again, Peace and Literature are the categories where women representation shows a steep upward slope, while the other categories show only slight improvement. If this trend continues, then I hope it will lead to a point of acceptable women representation among Nobel Laureates in the near future.

Age of Nobel Laureates

What is the average age of Nobel Laureates when they receive their Nobel Prize?

I always had this assumption that Nobel Prizes were only given to old people, so that they would have had time to make a sizeable contribution to society. A person in their 30s has obviously had less time to make contributions than they will have had by 60, right?

Warning message:
“Removed 11 rows containing non-finite values (stat_bin).”
Warning message:
“Removed 11 rows containing non-finite values (stat_boxplot).”

Turns out that my hunch was pretty much right. The average age of Nobel Laureates is between 55 and 65, while the IQR runs from approximately 50 to 70. There are obviously a few exceptions, like Malala Yousafzai, who was awarded the Nobel Prize at the age of 17! Another nice thing to notice is that the age distribution fits a Gaussian curve incredibly well, as shown.
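For reference, here is a small sketch of how the age at award and its summary statistics could be derived from the Year and BirthYr columns shown in the table earlier. Taking age as award year minus birth year is my assumption (it ignores birthdays), and the organization row is a made-up placeholder to show how a missing birth year behaves, which is presumably why a few rows were dropped in the warnings above.

```r
# Illustrative rows: the first two come from the table shown earlier,
# the organization row is a hypothetical placeholder with no birth year.
laureates <- data.frame(
  Name = c("Charles K. Kao", "Michael Levitt", "Malala Yousafzai", "Organization X"),
  Year = c(2009, 2013, 2014, 2012),
  BirthYr = c(1933, 1947, 1997, NA)
)

# ASSUMPTION: age at award is approximated as award year minus birth year.
# Rows without a birth year end up with an NA age.
laureates$Age <- laureates$Year - laureates$BirthYr

# Summary statistics, ignoring the rows with missing ages.
mean_age <- mean(laureates$Age, na.rm = TRUE)
age_quartiles <- quantile(laureates$Age, c(0.25, 0.75), na.rm = TRUE)

print(laureates)
print(mean_age)
print(age_quartiles)
```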

What is the trend of age of Nobel Laureates with time?

Similar to previous analysis, let us identify the trend of Nobel Laureate age vs. time.

[1] "Avg. age is increasing by 0.124 every year!"
`geom_smooth()` using formula 'y ~ x'

As can be seen, the Nobel Laureate age seems to increase slightly with time. In fact, it increases at an average rate of 0.124 years per year. This means that since the first Nobel Prizes in the early 1900s, the average age has increased by almost 15 years!
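A rate like this is the slope of a straight-line fit of age against award year. The sketch below shows the idea with lm() on synthetic data that I constructed to drift upward at 0.124 per year; it is not the notebook's data, and the end year of 2020 is my assumption.

```r
# Synthetic data (NOT the notebook's): ages drift upward by 0.124 per year,
# with a deterministic +/- 4 year wiggle standing in for real scatter.
# ASSUMPTION: the data runs from 1901 to 2020.
years <- 1901:2020
wiggle <- rep(c(-4, 4), length.out = length(years))
ages <- 55 + 0.124 * (years - min(years)) + wiggle

# Fit a straight line; the coefficient on `years` is the change in age per year.
fit <- lm(ages ~ years)
slope <- unname(coef(fit)["years"])

# Total change implied by the fitted slope over the whole period.
total_change <- slope * (max(years) - min(years))

print(slope)
print(total_change)
```

Multiplying the slope by the length of the period is where a figure like "almost 15 years" comes from: 0.124 times roughly 120 years.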

Are there awards/genders which boast significantly younger Nobel Laureates than others?

Again, let us compare the age distribution with different award categories and genders.

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Age ~ Category + Gender, data = data)

$Category
                                         diff        lwr        upr     p adj
Economics-Chemistry                 8.1969282   3.674826 12.7190300 0.0000041
Literature-Chemistry                6.0119750   1.920762 10.1031880 0.0004255
Peace-Chemistry                     2.1337898  -2.073380  6.3409593 0.6972698
Physics-Chemistry                  -2.5403023  -5.996107  0.9155024 0.2885373
Physiology or Medicine-Chemistry   -0.2136661  -3.648182  3.2208502 0.9999756
Literature-Economics               -2.1849533  -7.114817  2.7449103 0.8036943
Peace-Economics                    -6.0631384 -11.089649 -1.0366282 0.0078460
Physics-Economics                 -10.7372305 -15.154030 -6.3204312 0.0000000
Physiology or Medicine-Economics   -8.4105943 -12.810757 -4.0104315 0.0000009
Peace-Literature                   -3.8781852  -8.520860  0.7644892 0.1623465
Physics-Literature                 -8.5522773 -12.526788 -4.5777664 0.0000000
Physiology or Medicine-Literature  -6.2256410 -10.181656 -2.2696263 0.0001146
Physics-Peace                      -4.6740921  -8.767866 -0.5803178 0.0146106
Physiology or Medicine-Peace       -2.3474559  -6.423275  1.7283635 0.5689711
Physiology or Medicine-Physics      2.3266362  -0.967998  5.6212704 0.3335200

$Gender
               diff       lwr        upr     p adj
Woman-Man -3.951091 -7.184324 -0.7178575 0.0166698

This fun-looking violin plot shows how age varies with category and gender. Do note that there have been only two female Nobel Laureates in Economics, so there was insufficient data to draw the corresponding violin plot.

There wasn't any obvious trend that I was able to find in the violin plots. However, after running an ANOVA followed by Tukey tests, I was able to identify certain pairs of categories with a significant difference in average age (those with a 'p adj' value below 0.05). For example, Economics laureates are on average about 8 years older than Chemistry laureates and almost 11 years older than Physics laureates. Further, the age difference between male and female Nobel Laureates was also found to be statistically significant, with women about 4 years younger on average (p adj = 0.017).
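The output above comes from a model of the form aov(Age ~ Category + Gender). Here is a sketch of that workflow on randomly generated ages, so the specific numbers it prints mean nothing; only the model formula is taken from the output, and everything else (the data, the filtering step) is my own illustration.

```r
set.seed(42)
categories <- c("Chemistry", "Economics", "Literature", "Peace",
                "Physics", "Physiology or Medicine")

# Synthetic data (NOT the notebook's): 20 laureates per category, random ages.
data <- data.frame(
  Category = factor(rep(categories, each = 20)),
  Gender = factor(rep(c("Man", "Woman"), times = 60)),
  Age = round(rnorm(120, mean = 60, sd = 12))
)

# Two-way ANOVA without interaction, same formula as in the output above.
fit <- aov(Age ~ Category + Gender, data = data)

# Tukey's HSD compares every pair of levels within each factor.
tukey <- TukeyHSD(fit, conf.level = 0.95)

# Keep only the category pairs whose adjusted p-value is below 0.05.
category_pairs <- tukey$Category
significant <- category_pairs[category_pairs[, "p adj"] < 0.05, , drop = FALSE]
print(significant)
print(tukey$Gender)
```

With six categories there are 15 pairwise comparisons, which matches the 15 rows in the $Category table.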

Popularity of Nobel Laureates

Who are the most popular/celebrated/searched-for Nobel Laureates?

This is something that excited and fascinated me. I wanted to find interesting facts about how popular Nobel Laureates really are and how the Nobel prize affected their popularity.

To quantify popularity, I'm using the metric of Wikipedia page views. The more views a person's Wikipedia page gets, the more popular I assume that person is. Fair enough?

NOTE: The Wikipedia page view metrics only go as far back as July 2015. So we are essentially looking at how popular a person has been over the last five years or so.

So let's see who's the most famous Nobel Laureate - it's President Obama. He was awarded the Nobel Peace Prize in 2009 and remained President of the USA until 2017, by which time his popularity must have been very high.

One might assume living celebrities like Obama to be more popular and relevant today than deceased ones. But that is not always the case - Sir Winston Churchill, Albert Einstein, Martin Luther King Jr. and Theodore Roosevelt, though deceased, are still popular today.

Another really interesting aspect to observe in the plot is the category of Nobel Prize that the individuals won - most of them are Peace. This is probably because quite a few Nobel Peace Prizes have been awarded to presidents, ministers, politicians and activists - people who need to be popular and relevant among the masses in order to do their job or spread awareness. On the other hand, fame is usually not a concern for scientists, who can still do their job without being seen.

Do the Nobel Laureates experience a significant boost in popularity when they are awarded the Nobel prize?

This is also something that I do think about. Let's say tomorrow it is announced that I will win the Nobel Prize in Physiology or Medicine. Obviously, not a lot of people know about me (sob...), so they'll start looking me up on Google & Wikipedia, boosting my popularity. However, after the Award Ceremony, people will go back to not caring who I am (sob...).

This made me wonder: do Nobel Laureates experience a boost in popularity around the time they receive their Nobel Prize, and does that boost eventually die off?

The plot above shows the popularity trend of all Nobel Prize winners (after 2015) around their respective Nobel Prize ceremonies. For example, in the case of a 2017 Nobel Laureate, we track his/her popularity from August 2017 to April 2018. This time period is centered on December 2017, which is when the Nobel Prize would have been awarded to our Nobel Laureate.
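The tracking window can be sketched as a tiny helper that lists the nine months from August of the award year to April of the following year. The function name tracking_window is hypothetical and only illustrates the window described above.

```r
# Months tracked for a laureate awarded in `award_year`: August of that year
# through April of the next, i.e. four months either side of December.
# `tracking_window` is a hypothetical helper name used for illustration.
tracking_window <- function(award_year) {
  start <- as.Date(sprintf("%d-08-01", award_year))
  format(seq(start, by = "month", length.out = 9), "%Y-%m")
}

window_2017 <- tracking_window(2017)
print(window_2017)
```

Using the same nine-month window relative to each laureate's own award year is what lets laureates from different years be overlaid on a single plot.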

To my delight, the general trend was exactly how I expected - a sudden spike in popularity followed by a return to normalcy. However, the spike was observed not in December (when the Nobel Prizes are actually awarded) but in October. Upon some research, I found out that although Nobel Prizes are awarded in December, the winners are actually announced in October. This made sense, since immediately after the announcement in October, people will be curious to know who the soon-to-be Nobel Laureates are. And this curiosity quickly fades away some time after the announcements.

What surprised me was how consistent this pattern was. As shown, barring a few exceptions, this pattern is exactly the same for almost all the Nobel Laureates after 2015!

Thank You!

Hope you enjoyed this analysis as much as I did. Feel free to go through my R code if you wish. If you find any error in my approach or have any suggestions, do let me know. I hope this notebook at least taught you something interesting or unique today!